Image segmentation is a machine learning technique that splits an image into multiple regions. It can be helpful in medical imaging, such as X-rays and magnetic resonance imaging (MRI): these scans are often cluttered, so organs can be difficult to identify. Image segmentation can pick out an organ clearly from a cluttered image, which is very useful for healthcare personnel.
Our dataset consists of images of patients’ lungs. We designed a convolutional neural network with the UNET architecture that can identify a patient’s lungs in an image. A more in-depth explanation of convolutional neural networks and UNET can be found below.
The input data for this neural network consists of lung images from many patients. As shown below, the images are taken at different heights, and some patients’ lungs have slightly different shapes. All of these images are fed into the model along with their masks, which mark the correct location of the lungs. Additionally, slight transformations of these images are fed into the model to increase its robustness; this is known as data augmentation (or “generating images”). The transformations are typically horizontal or vertical flips of the image.
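As a concrete sketch of these flip transformations (a minimal NumPy version rather than our actual augmentation pipeline; `augment` and `rng` are illustrative names), note that the same flip must be applied to the image and its mask so the mask still marks the lung pixels:

```python
import numpy as np

def augment(image, mask, rng):
    """Randomly flip an image, applying the identical flip to its mask."""
    if rng.random() < 0.5:
        image, mask = np.fliplr(image), np.fliplr(mask)  # horizontal flip
    if rng.random() < 0.5:
        image, mask = np.flipud(image), np.flipud(mask)  # vertical flip
    return image, mask
```

Because image and mask are flipped together, every augmented pair remains a valid training example.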
Next, we fit the image data with a UNET model. The UNET is a convolutional neural network with a specific arrangement of layers, designed specifically for image segmentation. Generally, its encoder uses convolutional and pooling layers to progressively shrink the image’s spatial dimensions while increasing the number of feature channels; at the bottom of the “U”, a small stack of convolutions processes this compressed representation. Then, operations called “up-convolutions” expand the feature maps back to the original resolution, with matching encoder features concatenated in at each level, resulting in an image segmentation. The UNET architecture is shown below.
pic2
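The shrinking half of the “U” can be traced with a little arithmetic (a sketch; `encoder_shapes` is an illustrative helper, not part of the model code): each block’s two 3×3 “same” convolutions preserve the spatial size, each 2×2 max pooling halves it, and the filter count doubles at every level.

```python
def encoder_shapes(size=256, base_filters=64, levels=4):
    """Feature-map shape after each encoder block for a size x size input."""
    shapes = []
    filters = base_filters
    for _ in range(levels):
        shapes.append((size, size, filters))  # "same" padding keeps the size
        size //= 2                            # 2x2 max pooling halves each dimension
        filters *= 2                          # the next level doubles the filters
    shapes.append((size, size, filters))      # bottleneck at the bottom of the "U"
    return shapes

print(encoder_shapes())  # from (256, 256, 64) down to a (16, 16, 1024) bottleneck
```

This matches the code below: for a 256×256 input, the deepest layer (`conv5`) applies 1024 filters to 16×16 feature maps.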
The code used to build this UNET model is shown below.
# unet function
# default parameter - if no input size is given, assume it is (256, 256, 3)
from keras.layers import (Activation, BatchNormalization, Conv2D,
                          Conv2DTranspose, Input, MaxPooling2D, concatenate)
from keras.models import Model

def unet(input_size=(256, 256, 3)):
    inputs = Input(input_size)
    # First down-convolution / encoder leg begins, so start with Conv2D
    conv1 = Conv2D(filters=64, kernel_size=(3, 3), padding="same")(inputs)
    bn1 = Activation("relu")(conv1)
    conv1 = Conv2D(filters=64, kernel_size=(3, 3), padding="same")(bn1)
    bn1 = BatchNormalization(axis=3)(conv1)
    bn1 = Activation("relu")(bn1)
    pool1 = MaxPooling2D(pool_size=(2, 2))(bn1)
    conv2 = Conv2D(filters=128, kernel_size=(3, 3), padding="same")(pool1)
    bn2 = Activation("relu")(conv2)
    conv2 = Conv2D(filters=128, kernel_size=(3, 3), padding="same")(bn2)
    bn2 = BatchNormalization(axis=3)(conv2)
    bn2 = Activation("relu")(bn2)
    pool2 = MaxPooling2D(pool_size=(2, 2))(bn2)
    conv3 = Conv2D(filters=256, kernel_size=(3, 3), padding="same")(pool2)
    bn3 = Activation("relu")(conv3)
    conv3 = Conv2D(filters=256, kernel_size=(3, 3), padding="same")(bn3)
    bn3 = BatchNormalization(axis=3)(conv3)
    bn3 = Activation("relu")(bn3)
    pool3 = MaxPooling2D(pool_size=(2, 2))(bn3)
    conv4 = Conv2D(filters=512, kernel_size=(3, 3), padding="same")(pool3)
    bn4 = Activation("relu")(conv4)
    conv4 = Conv2D(filters=512, kernel_size=(3, 3), padding="same")(bn4)
    bn4 = BatchNormalization(axis=3)(conv4)
    bn4 = Activation("relu")(bn4)
    pool4 = MaxPooling2D(pool_size=(2, 2))(bn4)
    conv5 = Conv2D(filters=1024, kernel_size=(3, 3), padding="same")(pool4)
    bn5 = Activation("relu")(conv5)
    conv5 = Conv2D(filters=1024, kernel_size=(3, 3), padding="same")(bn5)
    bn5 = BatchNormalization(axis=3)(conv5)
    bn5 = Activation("relu")(bn5)
    # Now, up-convolution: upsample and concatenate with the matching encoder features
    up6 = concatenate(
        [
            Conv2DTranspose(512, kernel_size=(2, 2), strides=(2, 2), padding="same")(
                bn5
            ),
            conv4,
        ],
        axis=3,
    )
    # After every concatenation we again apply two consecutive regular
    # convolutions so that the model can learn to assemble a more precise output
    conv6 = Conv2D(filters=512, kernel_size=(3, 3), padding="same")(up6)
    bn6 = Activation("relu")(conv6)
    conv6 = Conv2D(filters=512, kernel_size=(3, 3), padding="same")(bn6)
    bn6 = BatchNormalization(axis=3)(conv6)
    bn6 = Activation("relu")(bn6)
    up7 = concatenate(
        [
            Conv2DTranspose(256, kernel_size=(2, 2), strides=(2, 2), padding="same")(
                bn6
            ),
            conv3,
        ],
        axis=3,
    )
    conv7 = Conv2D(filters=256, kernel_size=(3, 3), padding="same")(up7)
    bn7 = Activation("relu")(conv7)
    conv7 = Conv2D(filters=256, kernel_size=(3, 3), padding="same")(bn7)
    bn7 = BatchNormalization(axis=3)(conv7)
    bn7 = Activation("relu")(bn7)
    up8 = concatenate(
        [
            Conv2DTranspose(128, kernel_size=(2, 2), strides=(2, 2), padding="same")(
                bn7
            ),
            conv2,
        ],
        axis=3,
    )
    conv8 = Conv2D(filters=128, kernel_size=(3, 3), padding="same")(up8)
    bn8 = Activation("relu")(conv8)
    conv8 = Conv2D(filters=128, kernel_size=(3, 3), padding="same")(bn8)
    bn8 = BatchNormalization(axis=3)(conv8)
    bn8 = Activation("relu")(bn8)
    up9 = concatenate(
        [
            Conv2DTranspose(64, kernel_size=(2, 2), strides=(2, 2), padding="same")(
                bn8
            ),
            conv1,
        ],
        axis=3,
    )
    conv9 = Conv2D(filters=64, kernel_size=(3, 3), padding="same")(up9)
    bn9 = Activation("relu")(conv9)
    conv9 = Conv2D(filters=64, kernel_size=(3, 3), padding="same")(bn9)
    bn9 = BatchNormalization(axis=3)(conv9)
    bn9 = Activation("relu")(bn9)
    # A 1x1 convolution with a sigmoid gives a per-pixel probability mask
    conv10 = Conv2D(filters=1, kernel_size=(1, 1), activation="sigmoid")(bn9)
    return Model(inputs=[inputs], outputs=[conv10])  # give me predicted y
The Dice coefficient of two sets is a measure of their intersection scaled by their sizes, giving a value in the range 0 to 1: \[ Dice(\hat Y,Y) = \frac{2|\hat Y \cap Y|}{|\hat Y|+|Y|} \] After the UNET model is fitted, we can evaluate its accuracy. Our UNET reaches an accuracy of approximately 0.98 after around 50 training iterations.
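This metric can be sketched in a few lines of NumPy (thresholding the sigmoid output at 0.5; `dice_coefficient` is an illustrative name, and the small `eps` term is an assumption added to avoid division by zero on empty masks):

```python
import numpy as np

def dice_coefficient(y_pred, y_true, eps=1e-7):
    """Dice(Y_hat, Y) = 2|Y_hat & Y| / (|Y_hat| + |Y|) for binary masks."""
    y_pred = (np.asarray(y_pred, dtype=float) > 0.5).astype(float)  # binarize
    y_true = np.asarray(y_true, dtype=float)
    intersection = np.sum(y_pred * y_true)
    return (2.0 * intersection + eps) / (np.sum(y_pred) + np.sum(y_true) + eps)
```

A perfectly predicted mask scores 1, while a fully disjoint prediction scores (almost) 0.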
pic3
Some examples of the outputs are shown below. The first image is the original, and the second image is the mask. The final image is what our UNET predicts. As you can see, the UNET accurately segments the lungs in the original image.
pic4